retrained model
Ascent Fails to Forget
Contrary to common belief, we show that gradient ascent-based unconstrained optimization methods frequently fail to perform machine unlearning, a phenomenon we attribute to the inherent statistical dependence between the forget and retain data sets. This dependence, which can manifest itself even as simple correlations, undermines the misconception that these sets can be independently manipulated during unlearning. We provide empirical and theoretical evidence showing these methods often fail precisely due to this overlooked relationship. For random forget sets, this dependence means that degrading forget set metrics (which, for a retrained model, should mirror test set metrics) inevitably harms overall test performance. Going beyond random sets, we consider logistic regression as an instructive example where a critical failure mode emerges: inter-set dependence causes gradient descent-ascent iterations to progressively diverge from the ideal retrained model. Strikingly, these methods can converge to solutions that are not only far from the retrained ideal but are potentially even further from it than the original model itself, rendering the unlearning process actively detrimental. A toy example further illustrates how this dependence can trap models in inferior local minima, inescapable via finetuning. Our findings highlight that the presence of such statistical dependencies, even when manifest only as correlations, can be sufficient for ascent-based unlearning to fail. Our theoretical insights are corroborated by experiments on complex neural networks, demonstrating that these methods do not perform as expected in practice due to this unaddressed statistical interplay.
Extending Kernel Trick to Influence Functions
Sun, Zhenhuan, Valaee, Shahrokh
In this paper, we present a dual representation of the influence functions, whose computational complexity scales with dataset size rather than model size. Both analytically and experimentally, we show that this representation can be an efficient alternative to the original influence functions for estimating changes in parameters, model outputs and loss due to data point removal, when model size is large relative to dataset size, or when evaluating the original influence functions in parameter space is infeasible. The dual representation, however, is limited to linearizable models, which are models whose behavior can be approximated by their linearizations throughout training, and requires materializing a matrix, whose size grows with the product of model output dimension and dataset size.
On the Necessity of Output Distribution Reweighting for Effective Class Unlearning
Ebrahimpour-Boroojeny, Ali, Wang, Yian, Sundaram, Hari
In this paper, we reveal a significant shortcoming in class unlearning evaluations: overlooking the underlying class geometry can cause privacy leakage. We further propose a simple yet effective solution to mitigate this issue. We introduce a membership-inference attack via nearest neighbors (MIA-NN) that uses the probabilities the model assigns to neighboring classes to detect unlearned samples. Our experiments show that existing unlearning methods are vulnerable to MIA-NN across multiple datasets. We then propose a new fine-tuning objective that mitigates this privacy leakage by approximating, for forget-class inputs, the distribution over the remaining classes that a retrained-from-scratch model would produce. To construct this approximation, we estimate inter-class similarity and tilt the target model's distribution accordingly. The resulting Tilted ReWeighting (TRW) distribution serves as the desired distribution during fine-tuning. We also show that across multiple benchmarks, TRW matches or surpasses existing unlearning methods on prior unlearning metrics. More specifically, on CIFAR-10, it reduces the gap with retrained models by 19% and 46% for U-LiRA and MIA-NN scores, accordingly, compared to the SOTA method for each category.
Diluting Restricted Boltzmann Machines
Recent advances in artificial intelligence have relied heavily on increasingly large neural networks, raising concerns about their computational and environmental costs. This paper investigates whether simpler, sparser networks can maintain strong performance by studying Restricted Boltzmann Machines (RBMs) under extreme pruning conditions. Inspired by the Lottery Ticket Hypothesis, we demonstrate that RBMs can achieve high-quality generative performance even when up to 80% of the connections are pruned before training, confirming that they contain viable sub-networks. However, our experiments reveal crucial limitations: trained networks cannot fully recover lost performance through retraining once additional pruning is applied. We identify a sharp transition above which the generative quality degrades abruptly when pruning disrupts a minimal core of essential connections. Moreover, re-trained networks remain constrained by the parameters originally learned performing worse than networks trained from scratch at equivalent sparsity levels. These results suggest that for sparse networks to work effectively, pruning should be implemented early in training rather than attempted afterwards. Our findings provide practical insights for the development of efficient neural architectures and highlight the persistent influence of initial conditions on network capabilities.
MPRU: Modular Projection-Redistribution Unlearning as Output Filter for Classification Pipelines
Peng, Minyi, Gunamardi, Darian, Tjuawinata, Ivan, Lam, Kwok-Yan
As a new and promising approach, existing machine unlearning (MU) works typically emphasize theoretical formulations or optimization objectives to achieve knowledge removal. However, when deployed in real-world scenarios, such solutions typically face scalability issues and have to address practical requirements such as full access to original datasets and model. In contrast to the existing approaches, we regard classification training as a sequential process where classes are learned sequentially, which we call \emph{inductive approach}. Unlearning can then be done by reversing the last training sequence. This is implemented by appending a projection-redistribution layer in the end of the model. Such an approach does not require full access to the original dataset or the model, addressing the challenges of existing methods. This enables modular and model-agnostic deployment as an output filter into existing classification pipelines with minimal alterations. We conducted multiple experiments across multiple datasets including image (CIFAR-10/100 using CNN-based model) and tabular datasets (Covertype using tree-based model). Experiment results show consistently similar output to a fully retrained model with a high computational cost reduction. This demonstrates the applicability, scalability, and system compatibility of our solution while maintaining the performance of the output in a more practical setting.
Federated Unlearning Made Practical: Seamless Integration via Negated Pseudo-Gradients
Mora, Alessio, Mazzocca, Carlo, Montanari, Rebecca, Bellavista, Paolo
Abstract--The right to be forgotten is a fundamental principle of privacy-preserving regulations and extends to Machine Learning (ML) paradigms such as Federated Learning (FL). While FL enhances privacy by enabling collaborative model training without sharing private data, trained models still retain the influence of training data. Federated Unlearning (FU) methods recently proposed often rely on impractical assumptions for real-world FL deployments, such as storing client update histories or requiring access to a publicly available dataset. T o address these constraints, this paper introduces a novel method that leverages negated Pseudo-gradients Updates for Federated Unlearning (PUF). Our approach only uses standard client model updates, which are employed during regular FL rounds, and interprets them as pseudo-gradients. When a client needs to be forgotten, we apply the negation of their pseudo-gradients, appropriately scaled, to the global model. Unlike state-of-the-art mechanisms, PUF seamlessly integrates with FL workflows, incurs no additional computational and communication overhead beyond standard FL rounds, and supports concurrent unlearning requests. We extensively evaluated the proposed method on two well-known benchmark image classification datasets (CIF AR-10 and CIF AR-100) and a real-world medical imaging dataset for segmentation (ProstateMRI), using three different neural architectures: two residual networks and a vision transformer . The experimental results across various settings demonstrate that PUF achieves state-of-the-art forgetting effectiveness and recovery time, without relying on any additional assumptions. N today's digital landscape, privacy has become a major concern, as reflected by the emergence of robust regulatory frameworks worldwide [1]. The European Union (EU) has consistently emphasized the importance of protecting personal data, exemplified by the introduction of the General Data Protection Regulation (GDPR) in 2016 [2]. Most recently, in May 2024, the EU enacted Regulation 2024/1183 [3], establishing the European Digital Identity Framework that empowers individuals with fine-grained control over their information. One of the key rights of these regulations is the right to be forgotten, which allows individuals to request the deletion of their previously shared data. Similar rights are central to other major privacy laws worldwide, such as the California Consumer Privacy Act (CCP A) [4] where the right to delete grants California residents the on-demand removal of personal data held by businesses. Alessio Mora, Rebecca Montanari, and Paolo Bellavista are with the Department of Computer Science and Engineering, University of Bologna, Bologna, Italy (e-mail: {name.surname}@unibo.it).
Data Augmentation Improves Machine Unlearning
Falcao, Andreza M. C., Cordeiro, Filipe R.
Machine Unlearning (MU) aims to remove the influence of specific data from a trained model while preserving its performance on the remaining data. Although a few works suggest connections between memorisation and augmentation, the role of systematic augmentation design in MU remains under-investigated. In this work, we investigate the impact of different data augmentation strategies on the performance of unlearning methods, including SalUn, Random Label, and Fine-Tuning. Experiments conducted on CIFAR-10 and CIFAR-100, under varying forget rates, show that proper augmentation design can significantly improve unlearning effectiveness, reducing the performance gap to retrained models. Results showed a reduction of up to 40.12% of the Average Gap unlearning Metric, when using TrivialAug augmentation. Our results suggest that augmentation not only helps reduce memorization but also plays a crucial role in achieving privacy-preserving and efficient unlearning.